A tool for integrated analysis of liquid biopsy sequencing data
Download the source code from GitHub and install the dependencies listed below:
git clone https://github.com/ShangZhang/exVariance.git
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda update conda
conda install -c conda-forge mamba
mamba create -n exvariance4 -c conda-forge -c bioconda snakemake=5.14.0 r-base=3.6.3 -y
mamba install -c r -c conda-forge -c bioconda -c eugene_t r-argparse r-clustersim r-ggpubr bioconductor-scater bioconductor-scran bioconductor-singlecellexperiment bioconductor-sva bioconductor-edger bioconductor-ruvseq r-kbet r-devtools -y
devtools::install_github(c("hemberg-lab/scRNA.seq.funcs","theislab/kBET"))
OR
install.packages(c("argparse","clusterSim","ggpubr","BiocManager","devtools"))
BiocManager::install(c("scater","scran","SingleCellExperiment","sva","edgeR","RUVSeq"))
devtools::install_github(c("hemberg-lab/scRNA.seq.funcs"))
devtools::install_github(c("theislab/kBET"))
For easier installation, you can use the exVariance Docker image, which has all dependencies installed:
docker pull <exVariance_image>
Alternatively, you can use Singularity or udocker to run the container if your Linux kernel is older than version 3 or you don't have permission to use Docker.
exVariance depends on reference files, which are available for the supported species listed below: hg38
To unzip these files: tar -xzf hg19.tar.gz OR tar -xzf mm9.tar.gz
Run exVariance --help to get the usage:
usage: exVariance [-h] --user_config_file USER_CONFIG_FILE
[--cluster]
[--cluster-config CLUSTER_CONFIG]
[--cluster-command CLUSTER_COMMAND]
[--singularity SINGULARITY]
[--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR]
{ RNA_seq_pre_process,RNA_seq_exp_matrix,
RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
DNA_seq_ctDNA_mutation,DNA_seq_NP,
DNA_meth_WGBS,DNA_meth_RRBS,
DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
}
exVariance is a tool for integrated analysis the liquid biopsy sequencing data.
positional arguments:
{ RNA_seq_pre_process,RNA_seq_exp_matrix,
RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
DNA_seq_ctDNA_mutation,DNA_seq_NP,
DNA_meth_WGBS,DNA_meth_RRBS,
DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
}
optional arguments:
-h, --help show this help message and exit
--user_config_file USER_CONFIG_FILE, -u USER_CONFIG_FILE
the user config file
--cluster submit to cluster
--cluster-config CLUSTER_CONFIG
cluster configuration file
--cluster-command CLUSTER_COMMAND
command for submitting job to cluster (default read
from {config_dir}/cluster_command.txt
--singularity SINGULARITY
singularity image file
--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR
directory for singularity wrappers
For additional help or support, please visit https://github.com/ShangZhang/exVariance
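Because the sub-command must be one of the names listed in the help output, a wrapper script can check the name before starting a long run. The following is a minimal sketch and not part of exVariance: the variable `EXVARIANCE_MODULES` and the function `is_valid_module` are hypothetical names, and the list is simply copied from the help text above.

```shell
#!/bin/sh
# Sub-command names copied from the `exVariance --help` output.
# EXVARIANCE_MODULES and is_valid_module are illustrative names only.
EXVARIANCE_MODULES="RNA_seq_pre_process RNA_seq_exp_matrix
RNA_seq_fusion_transcripts RNA_seq_RNA_editing
RNA_seq_SNP RNA_seq_APA RNA_seq_AS
DNA_seq_ctDNA_mutation DNA_seq_NP
DNA_meth_WGBS DNA_meth_RRBS
DNA_meth_Seal_seq DNA_meth_Methyl-cap_seq
DNA_meth_MeDIP_seq DNA_meth_MCTA_seq"

# is_valid_module NAME
# Returns 0 if NAME is one of the listed sub-commands, 1 otherwise.
is_valid_module() {
    for ivm_name in $EXVARIANCE_MODULES; do
        [ "$ivm_name" = "$1" ] && return 0
    done
    return 1
}
```

For example, `is_valid_module RNA_seq_SNP && exVariance -u <USER_CONFIG_FILE> RNA_seq_SNP` only starts the run when the name is spelled correctly.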
RNA-seq examples can be found in the demo directory, which has the following structure:
./demo/*/
|-- config
| |-- default_config.yaml
| |-- <data_name>.yaml
| |-- dapars_configure.txt
| `-- RNAEditor_configure.txt
|-- data
| |-- fastq/
| |-- sample_ids.txt
| |-- sample_classes.txt
| |-- compare_groups.yaml
| `-- batch_info.txt
|-- output
`-- summary
Examples for the other data types can be found in the demo directory, which has the following structure:
./demo/*/
|-- config
| |-- default_config.yaml
| `-- <data_name>.yaml
|-- data
| |-- fastq/
| `-- sample_ids.txt
|-- output
`-- summary
Note:
config/default_config.yaml: the default configuration file. Do not change its content unless you understand the settings.
config/<data_name>.yaml: the user-defined configuration file, which specifies the paths used for the dataset.
data/fastq/: directory containing the sequence files, each named after its sample and suffixed with 'fastq', 'fasta.gz' or 'fastq.gz'.
data/sample_ids.txt: table of sample names (the file names with the suffix 'fastq', 'fasta.gz' or 'fastq.gz' removed).
output/: the output directory.
summary/: directory containing the summary files.
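Since data/sample_ids.txt holds the file names from data/fastq/ with their suffixes removed, it can be generated rather than typed by hand. The sketch below is one possible way to do this and makes several assumptions that the note above does not confirm: the suffix is separated from the sample name by a dot, there is one file per sample, and the table lists one sample name per line. The function names `strip_suffix` and `write_sample_ids` are hypothetical.

```shell
#!/bin/sh
# strip_suffix NAME
# Prints NAME without a trailing .fastq.gz, .fasta.gz or .fastq.
# Assumption: the suffix is joined to the sample name with a dot.
strip_suffix() {
    case "$1" in
        *.fastq.gz) printf '%s\n' "${1%.fastq.gz}" ;;
        *.fasta.gz) printf '%s\n' "${1%.fasta.gz}" ;;
        *.fastq)    printf '%s\n' "${1%.fastq}" ;;
        *)          printf '%s\n' "$1" ;;
    esac
}

# write_sample_ids FASTQ_DIR OUT_FILE
# Writes the sorted, de-duplicated sample names to OUT_FILE, one per line.
# Assumption: sample_ids.txt is a plain one-column list.
write_sample_ids() {
    for wsi_path in "$1"/*; do
        # Skip sub-directories and the unexpanded pattern of an empty directory.
        [ -f "$wsi_path" ] || continue
        strip_suffix "$(basename "$wsi_path")"
    done | sort -u > "$2"
}
```

Calling `write_sample_ids data/fastq data/sample_ids.txt` from inside a dataset directory would then rebuild the table; check the result against the demo files before relying on it.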
You can create your own data directory with the above directory structure.
Multiple datasets can be put in the same directory by replacing "example" with your own dataset names.
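The layout described above can be created with a few `mkdir` calls. The following sketch sets up the common part of the structure for a new dataset. It assumes that `<data_name>` in the user config file name matches the dataset directory name, which the demo layout suggests but does not state. The function name `make_dataset_dir` is hypothetical, and the two files it creates are empty placeholders that still have to be filled in. `default_config.yaml` is not created and would need to be copied from a demo directory.

```shell
#!/bin/sh
# make_dataset_dir ROOT NAME
# Creates ROOT/NAME with the config/, data/fastq/, output/ and summary/
# sub-directories shown in the demo layout.
make_dataset_dir() {
    mkdir -p "$1/$2/config" "$1/$2/data/fastq" "$1/$2/output" "$1/$2/summary"
    # Add empty placeholders without overwriting files that already exist.
    # Assumption: the user config is named after the dataset directory.
    for mdd_file in "$1/$2/config/$2.yaml" "$1/$2/data/sample_ids.txt"; do
        [ -e "$mdd_file" ] || : > "$mdd_file"
    done
}
```

For example, `make_dataset_dir ./demo my_dataset` would add a `my_dataset` directory next to the existing demo datasets.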
exVariance -u <USER_CONFIG_FILE> RNA_seq_pre_process
exVariance -u <USER_CONFIG_FILE> RNA_seq_exp_matrix
exVariance -u <USER_CONFIG_FILE> RNA_seq_fusion_transcripts
exVariance -u <USER_CONFIG_FILE> RNA_seq_RNA_editing
exVariance -u <USER_CONFIG_FILE> RNA_seq_SNP
exVariance -u <USER_CONFIG_FILE> RNA_seq_APA
exVariance -u <USER_CONFIG_FILE> RNA_seq_AS

exVariance -u <USER_CONFIG_FILE> DNA_meth_WGBS
exVariance -u <USER_CONFIG_FILE> DNA_meth_RRBS
exVariance -u <USER_CONFIG_FILE> DNA_meth_Seal_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_Methyl-cap_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_MeDIP_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_MCTA_seq

exVariance -u <USER_CONFIG_FILE> DNA_seq_ctDNA_mutation
exVariance -u <USER_CONFIG_FILE> DNA_seq_NP
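When several modules are needed for the same dataset, the commands above can be chained in a small wrapper. The sketch below is not part of exVariance. The function name `run_exvariance_modules` and the `DRY_RUN` variable are hypothetical, and it assumes that the modules should run one after another and that a failed module should stop the chain.

```shell
#!/bin/sh
# run_exvariance_modules CONFIG MODULE...
# Runs each module in the given order with the same user config file
# and stops at the first module that fails.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_exvariance_modules() {
    rem_config=$1
    shift
    for rem_module in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "exVariance -u $rem_config $rem_module"
        else
            exVariance -u "$rem_config" "$rem_module" || return 1
        fi
    done
}
```

A call such as `run_exvariance_modules <USER_CONFIG_FILE> RNA_seq_pre_process RNA_seq_exp_matrix RNA_seq_SNP` would run pre-processing first, on the assumption (suggested by the order of the commands above, not confirmed) that the other RNA-seq modules build on its output.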
Some of the tools that exVariance uses, e.g. STAR, are very memory-intensive. Therefore, we recommend the following system requirements for exVariance:
We recommend running exVariance on a server with at least 48 GB of RAM, which allows a single-threaded run on human samples.
For multi-threaded mode, which speeds up the workflow significantly, we recommend at least 64 GB of RAM and a CPU with at least 4 cores.
Our own servers have 64 GB of RAM and 16 cores.
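The recommendations above can be turned into a quick check before starting a run. The following sketch is only a rough guide and not part of exVariance: the function name `recommend_mode` and its three labels are hypothetical, and the detection step assumes a Linux system with /proc/meminfo. The kernel usually reports slightly less memory than is physically installed, so a 64 GB server may show up just under the 64 GB threshold.

```shell
#!/bin/sh
# recommend_mode RAM_GB CORES
# Maps whole-number RAM (GB) and core counts to the recommendations above.
recommend_mode() {
    if [ "$1" -ge 64 ] && [ "$2" -ge 4 ]; then
        echo "multi-threaded"
    elif [ "$1" -ge 48 ]; then
        echo "single-threaded"
    else
        echo "below-recommendation"
    fi
}

# Detect the resources of the current machine (Linux only).
if [ -r /proc/meminfo ]; then
    # MemTotal is given in kB; convert to whole GB, rounding down.
    ram_gb=$(awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo)
    cores=$(getconf _NPROCESSORS_ONLN)
    echo "Detected ${ram_gb} GB RAM and ${cores} cores: $(recommend_mode "$ram_gb" "$cores")"
fi
```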
Copyright (C) Lu Lab @ Tsinghua University, Beijing, China 2020 All rights reserved